Protecting infrastructure networks from cost-based attacks 
It has been known that heterogeneous networks are vulnerable to the intentional removal of a small fraction
of highly connected or loaded nodes, which implies that, to protect a network effectively, a few important nodes
should be allocated more defense resources than the others. However, if too many resources are allocated
to the few important nodes, the numerous less-important nodes will be less protected and, when attacked
all together, are still capable of causing devastating damage. A natural question therefore is how to efficiently
distribute the limited defense resources among the network nodes such that the network damage is minimized
whatever attack strategy the attacker may take. In this paper, taking into account the factor of attack cost, we
revisit the problem of network security and search for efficient network defense against cost-based
attacks. The study shows that, for a general complex network, there exists an optimal distribution of the
defense resources, with which the network is well protected from cost-based attacks. Furthermore, it is found
that the configuration of the optimal defense depends on the network parameters. Specifically, a network that
has a larger size, sparser connections, and a more heterogeneous structure will benefit more from the defense
optimization.

PACS numbers: 89.75.-k, 89.20.Hh, 05.10.-a 



Introduction. - Modern human societies depend heavily on the efficient
functioning and stable operation of complex infrastructure networks [1].
Typical examples are electrical power grids, telecommunication networks,
the Internet, and many transportation systems such as road, railway, and
airline networks. A significant and common feature of these networks is
that they all possess a heterogeneous degree distribution, i.e. they are
scale-free networks (SFN) [2]. While the adoption of the SFN structure can
improve the network performance significantly, e.g. by giving a shorter
average network diameter, it also causes problems for network security.
For instance, it has been shown that the connectivity of a SFN can be
largely damaged if a small fraction of the large-degree nodes are
intentionally removed; in contrast, if the removal is applied to the
small-degree nodes, the network damage will be very limited [3]. The
robust-yet-fragile property of SFN is more evident when the intrinsic
dynamics of the network flow is taken into account [4]. This has been shown
by a model of cascade network in Ref. [5], where it is found that, due to
the existence of the flow dynamics, the removal of even a single node can
trigger such a large-scale avalanche that only a small portion of the nodes
survive the cascading failures. Since practical networks typically carry
flows, their security against cascading failures is of great importance and
has drawn much attention in the past years. The topics that have been
addressed include: model design [6], damage estimation [7], dynamics
characterization [8], capacity allocation [9], topology dependence [10],
and cascade control and defense strategies [11].
Problem formulation. - While the fragility of SFN to intentional node
removal has been well addressed, so far the studies have concentrated only
on the case of "technical" failures, instead of real attacks. More
specifically, the previous studies are interested in comparing the extent
of the network damage caused by different implementations of attacks, while
neglecting the cost required in doing so. In a practical situation of
network security, the attacker and the defender are the two sides of a
game. Their purposes are the same in a sense, i.e., to maximize the gains
with limited resources. The defender, knowing the important roles of the
large-degree nodes, will of course allocate more defense resources to them;
and the attacker, while desiring to attack the large-degree nodes, has to
weigh the higher cost of doing so. Thus in a real attack, the attacker will
balance the network damage against the attack cost, and search for an
effective attack. For example, for the cost of attacking one important and
well-protected node, the attacker may instead attack a number of
non-important and less-protected nodes all together, and the latter may
generate the larger damage. So, before taking action, the attacker will
analyze the network security so as to find its weak point.

To analyze the network security, the attacker usually will design a series
of virtual attacks based on some of the network information, e.g. the
network structure and the defense configuration, and then evaluate the
possible damage caused by each attack. After comparing the damages, the
attacker will identify the most damaging attack and put it into action.
Generally, the virtual attacks are designed according to two strategies:
(1) concentrating all the effort on attacking a few important and
well-protected nodes; and (2) distributing the effort over a number of
non-important and less-protected nodes. We call the former the concentrated
attack (CA) strategy, and the latter the distributed attack (DA) strategy.
It is straightforward to see that, if the nodes are equally protected, the
network will be vulnerable to CA; in contrast, if too many defense
resources are allocated to the important nodes, the network will be
vulnerable to DA. The challenge faced by the defender is therefore: how to
optimize the network defense so that the network damage is minimized
whatever attack strategy the attacker may take?

The problem of cost-based attacks can be formulated as follows. Let
$P = \{p_i, i = 1, \ldots, N\}$ be the existing defense of an
infrastructure network consisting of $N$ nodes. The defense resource
allocated to node $i$ is $p_i$, so the total amount of the network defense
is $R = \sum_{i=1}^{N} p_i$. In the current study, we assume that the
attacker has full knowledge of the network, including the network topology,
the flow dynamics, and the defense distribution (the general case will be
discussed later). Based on this information, the attacker will devise a
series of virtual attacks, $A_n = \{a_{n,j}, j = 1, \ldots, N'\}$, based on
either the CA or the DA strategy. In the attack $A_n$, $N'$ out of the $N$
nodes in the network are selected as the targets, and the cost for removing
target $j$ is denoted by $a_{n,j}$. The total attack cost of $A_n$
therefore is $E_n = \sum_{j=1}^{N'} a_{n,j} = E$, which is identical for
all the attacks pointing to the defense $P$. In general, we have
$E \ll R$. The network damage caused by $A_n$ is denoted by
$D_n = \{b_{n,l}, l = 1, \ldots, M\}$, where $\{l\}$ is the set of the
failed nodes due to the attack $A_n$, and $b_{n,l}$ is the amount of
network damage due to the failure of node $l$. The total network damage
caused by $A_n$ can then be quantified as $B_n = \sum_{l=1}^{M} b_{n,l}$.
By evaluating the damage of each of the virtual attacks, the attacker
finally identifies the most devastating attack.

The optimal defense is defined as follows. If the defense resources are
distributed in such a way that all the virtual attacks generate the same
amount of network damage, then this distribution of defense resources is
called the optimal defense, and the network is regarded as secure against
cost-based attacks. Otherwise, if the network damages differ, the
distribution is considered not optimal and the network is regarded as
vulnerable to cost-based attacks. Put differently, if the attacker can
increase the network damage by changing the attack strategy, the network is
considered not securely protected.
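The definition above can be illustrated with a minimal Python sketch. It assumes that each virtual attack is represented by the list of per-node damages $b_{n,l}$ it produces; the function names, the sample numbers, and the relative tolerance `rel_tol` are hypothetical and introduced only for illustration.

```python
def total_damage(per_node_damage):
    """B_n: sum of the damages b_{n,l} over the failed nodes of one attack."""
    return sum(per_node_damage)


def most_damaging_attack(virtual_attacks):
    """Return the name of the attack with the largest B_n, plus all totals."""
    totals = {name: total_damage(d) for name, d in virtual_attacks.items()}
    worst = max(totals, key=totals.get)
    return worst, totals


def is_optimal_defense(totals, rel_tol=0.05):
    """True when all attacks cause (nearly) the same total damage.

    rel_tol is an illustrative tolerance, not a quantity from the paper.
    """
    values = list(totals.values())
    return max(values) - min(values) <= rel_tol * max(values)


# Hypothetical per-node damages for two equal-cost virtual attacks.
virtual_attacks = {"CA": [12.0, 3.0, 1.5], "DA": [0.4, 0.3, 0.5, 0.2]}
worst, totals = most_damaging_attack(virtual_attacks)
optimal = is_optimal_defense(totals)
```

With these made-up numbers the attacker gains by choosing CA over DA, so the defense would be classified as not optimal.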

The model. - We implement the above idea of network security by a model of
cascade network [5] (the generalization to other models is straightforward
[11]). Let $L_i(0)$ be the transmission load (betweenness centrality) of
node $i$, which accounts for the total number of shortest paths passing
through $i$ in the original network [12]. Define the node capacity as
$C_i = (1 + \alpha) L_i(0)$, which stipulates the maximum load that node
$i$ can handle; $\alpha > 0$ is the tolerance parameter. Once a node is
attacked, it is removed from the network, together with the links attached
to it. Because of the node removal, the shortest paths of the network are
redistributed and, consequently, the loads of the remaining nodes are
updated. In this process, any node which is overloaded, i.e.
$L_i(t) > C_i$, is removed from the network. The new removals cause a new
distribution of the shortest paths, thus generating another wave of node
failures, and so on, until no node is overloaded in the remaining network.
To fit this model into our problem of cost-based attacks, it is necessary
to make a few assumptions. Firstly, it is assumed that the defense
resources have the following power-law distribution,

    $p_i = R \times C_i^{\beta} / C(\beta)$,    (1)

where $R$ is the total defense of the network, and
$C(\beta) = \sum_{i=1}^{N} C_i^{\beta}$ is a normalizing factor which
depends on the parameter $\beta$. Without loss of generality, here we set
$R = C(\beta = 1)$, i.e. the network defense equals the network capacity.
Secondly, it is assumed that the cost for removing a node is equal to the
node defense, i.e., $a_i = p_i$. Finally, it is assumed that the network
damage depends only on the removed nodes. In the current study, the network
damage is measured by two quantities: (1) the size of the largest component
in the remaining network, $G$, and (2) the total capacity of the removed
nodes, $B = \sum_{l=1}^{M} C_l$. It is emphasized that these assumptions
are made only for the purpose of illustration. In real applications, they
should be redefined according to the real problems. The key parameter in
this model therefore is $\beta$, which gives the distribution of the
defense resources. When $\beta \ll 0$, the important (high-load) nodes are
not allocated sufficient resources, making the network vulnerable to CA. In
contrast, if $\beta \gg 0$, the important nodes are overprotected, making
the network vulnerable to DA. So, to protect the network from cost-based
attacks efficiently, the value of $\beta$ should be properly set.
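As an illustration of Eq. (1), the following minimal Python sketch distributes a total defense R among nodes according to their capacities. The function name `allocate_defense` and the sample capacities are hypothetical; the default choice R = C(β = 1) follows the convention stated in the text, and all capacities are assumed to be strictly positive so that negative β is well defined.

```python
def allocate_defense(capacities, beta, total_defense=None):
    """Return p_i = R * C_i**beta / C(beta), with C(beta) = sum_i C_i**beta."""
    if total_defense is None:
        # Convention from the text: R = C(beta = 1), the total network capacity.
        total_defense = sum(capacities)
    weights = [c ** beta for c in capacities]
    norm = sum(weights)  # the normalizing factor C(beta)
    return [total_defense * w / norm for w in weights]


capacities = [1.3, 2.6, 5.2, 13.0]  # hypothetical node capacities C_i
uniform = allocate_defense(capacities, beta=0.0)        # equal protection
proportional = allocate_defense(capacities, beta=1.0)   # p_i = C_i
concentrated = allocate_defense(capacities, beta=3.0)   # hub-heavy defense
```

Setting β = 0 gives every node the same share, β = 1 reproduces the capacities themselves, and larger β shifts the resources toward the highest-capacity node, which is the trade-off the parameter is meant to control.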

We next describe the method used in our analysis of the network security.
Since the virtual attacks are divided into two classes, CA and DA, the
network security can be evaluated by considering two representative
attacks. For CA, we choose to attack the single node of the largest
capacity (highest protection) in the network; for DA, with the same amount
of attack cost, we choose to attack a group of nodes of the smallest
capacity (lowest protection). Specifically, if the nodes are ranked in
ascending order of the node capacity, i.e.
$C_1 < C_2 < \ldots < C_N$, then in CA only node $N$ is attacked, while in
DA nodes $1$ to $N'$ are attacked all together. Here $N'$ is a number
determined by the relation $\sum_{i=1}^{N'} a_i = a_N$. Please note that in
a real situation it is possible that the most devastating attack is neither
of the above representative attacks. However, such a devastating attack, if
it exists, will depend strongly on the network particulars, and should
always be treated case by case [13].
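The selection of the two representative attacks can be sketched in Python as follows. Since the cumulative cost of the weakest nodes will rarely equal the cost of the strongest node exactly, this sketch assumes that N' is the largest number of lowest-capacity nodes whose total cost does not exceed that budget; this rounding rule, the function name, and the sample numbers are illustrative assumptions rather than details given in the text.

```python
def representative_attacks(costs, capacities):
    """Return (ca_targets, da_targets) as lists of node indices.

    CA: the single node of largest capacity.
    DA: the lowest-capacity nodes, taken in ascending order of capacity,
        while their cumulative cost stays within the CA cost (assumed rule).
    """
    order = sorted(range(len(capacities)), key=lambda i: capacities[i])
    ca_target = order[-1]
    budget = costs[ca_target]

    da_targets, spent = [], 0.0
    for i in order[:-1]:
        if spent + costs[i] > budget + 1e-12:  # small slack for rounding
            break
        da_targets.append(i)
        spent += costs[i]
    return [ca_target], da_targets


# Hypothetical example with cost equal to capacity (the beta = 1 case).
capacities = [4.0, 1.0, 7.0, 2.0, 5.0]
costs = list(capacities)
ca, da = representative_attacks(costs, capacities)
```

In this example the CA removes node 2 (capacity 7), while the DA spends the same budget on nodes 1, 3, and 0.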

FIG. 1: (Color online) For SFNs of size $N = 3000$, average degree
$\langle k \rangle = 4$, and tolerance parameter $\alpha = 0.3$, the
dependence of the network damage on parameter $\beta$ for CA and DA.
(a) $G_{1,2}$ versus $\beta$. The optimal defense is found at about
$\beta_g \approx 1.25$. Inset: $\rho_g$ versus $\beta$. (b) $B_{1,2}$
versus $\beta$. The optimal defense is found at about
$\beta_b \approx 1.28$. Please note the semi-logarithmic plot of
$B_{1,2}$. Inset: $\rho_b$ versus $\beta$. Each data point is averaged over
50 network realizations.

Numerical results. - To simulate the cost-based attacks, firstly we
generate a SFN by the model proposed in Ref. [2]. The network consists of
$N = 3000$ nodes and has average degree $\langle k \rangle = 4$. The degree
distribution follows a power-law scaling $P(k) \sim k^{-\gamma}$, with
$\gamma = 3$. Secondly, we calculate the transmission load of each node
and, according to the value of $\alpha$, calculate the node capacity. For
illustration, here we set $\alpha = 0.3$. Then we can obtain the total
defense of the network $R$, which in our model is set to be the total
network capacity, i.e. $R = \sum_{i=1}^{N} C_i$. Thirdly, we choose a value
for $\beta$ and, according to Eq. (1), distribute the defense resources
among the nodes. Fourthly, we analyze the network security by the two
representative attacks mentioned above, and record their damages $G_{1,2}$
and $B_{1,2}$, with the subscripts 1 and 2 standing for CA and DA,
respectively. Finally, by scanning $\beta$, we are able to locate the
optimal defense, i.e., the value of $\beta$ where the two attacks generate
the same network damage.
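The load calculation and the cascade step used in this procedure can be sketched in plain Python as below. This is a minimal illustration under stated assumptions, not the authors' code: the graph is an undirected adjacency dictionary, the load is computed with Brandes' algorithm counting ordered source-target pairs and including path endpoints, all overloaded nodes are removed simultaneously in each round, and the damage B counts the capacity of both the attacked and the overloaded nodes. All function names are hypothetical.

```python
from collections import deque


def loads(adj):
    """Shortest-path load of every node (Brandes' algorithm, endpoints counted)."""
    load = dict.fromkeys(adj, 0.0)
    for s in adj:
        stack, preds = [], {v: [] for v in adj}
        sigma = dict.fromkeys(adj, 0.0)  # number of shortest paths from s
        sigma[s] = 1.0
        dist = {s: 0}
        queue = deque([s])
        while queue:  # breadth-first search from s
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = dict.fromkeys(adj, 0.0)
        load[s] += len(stack) - 1  # s is the source of paths to reachable nodes
        while stack:  # back-propagate dependencies, farthest nodes first
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                load[w] += delta[w] + 1.0  # +1: w is the target of a path from s
    return load


def remove_nodes(adj, removed):
    """Copy of the graph without the given nodes and their links."""
    removed = set(removed)
    return {v: {w for w in nbrs if w not in removed}
            for v, nbrs in adj.items() if v not in removed}


def cascade(adj, targets, alpha):
    """Remove targets, then iterate overload failures; return (survivors, capacity)."""
    capacity = {v: (1.0 + alpha) * l for v, l in loads(adj).items()}
    current = remove_nodes(adj, targets)
    while True:
        load = loads(current)
        overloaded = [v for v in current if load[v] > capacity[v] + 1e-9]
        if not overloaded:
            return current, capacity
        current = remove_nodes(current, overloaded)


def largest_component(adj):
    """Size G of the largest connected component."""
    seen, best = set(), 0
    for s in adj:
        if s in seen:
            continue
        seen.add(s)
        queue, size = deque([s]), 0
        while queue:
            v = queue.popleft()
            size += 1
            for w in adj[v] - seen:
                seen.add(w)
                queue.append(w)
        best = max(best, size)
    return best


# Toy example: a ring of 6 nodes, attacked at node 0 with tolerance alpha = 0.1.
ring = {i: {(i - 1) % 6, (i + 1) % 6} for i in range(6)}
survivors, capacity = cascade(ring, targets=[0], alpha=0.1)
G = largest_component(survivors)
B = sum(capacity[v] for v in ring if v not in survivors)
```

In the ring example, removing node 0 turns the ring into a chain whose middle node becomes overloaded and fails, leaving two components of two nodes each. With a larger tolerance such as 0.3 the same attack triggers no further failure.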

The variations of $G$ and $B$ as a function of $\beta$ are plotted in
Fig. 1. For the measurement $G$, the optimal defense is found at about
$\beta_g \approx 1.25$ [Fig. 1(a)]; while for the measurement $B$, the
optimal defense is found at about $\beta_b \approx 1.28$ [Fig. 1(b)].
Please note that the optimal defense is only meaningful to the defender, as
it tells how to configure the defense resources against the cost-based
attacks. For the attacker, who knows the specific network defense (the
value of $\beta$), the only task is to figure out which attack is more
damaging, DA or CA. For instance, if the attacker is interested in a larger
damage of network capacity and has learned that the network defense
parameter is $\beta = 0.5$, then after a comparison of the virtual attacks
the attacker will find that CA causes a larger damage than DA [Fig. 1(b)].
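Locating the optimal defense from a scan amounts to finding where the two damage curves cross. A minimal Python sketch is given below, assuming the damages of CA and DA have been sampled on a grid of the defense parameter and that linear interpolation between neighboring grid points is acceptable; the function name and the sample values are hypothetical and are not data from the paper.

```python
def crossing_point(betas, damage_ca, damage_da):
    """Return the parameter value where the two damage curves are equal.

    Uses linear interpolation of the difference (DA minus CA) between the
    first pair of neighboring samples where it changes sign.
    Returns None if the curves do not cross on the sampled grid.
    """
    diffs = [da - ca for ca, da in zip(damage_ca, damage_da)]
    for k in range(len(betas) - 1):
        d0, d1 = diffs[k], diffs[k + 1]
        if d0 == 0.0:
            return betas[k]
        if d0 * d1 < 0.0:
            t = d0 / (d0 - d1)  # fraction of the interval at the zero
            return betas[k] + t * (betas[k + 1] - betas[k])
    if diffs and diffs[-1] == 0.0:
        return betas[-1]
    return None


# Made-up scan: CA damage is constant, DA damage grows with the parameter.
betas = [0.0, 0.5, 1.0, 1.5, 2.0]
damage_ca = [100.0, 100.0, 100.0, 100.0, 100.0]
damage_da = [10.0, 30.0, 90.0, 130.0, 200.0]
beta_opt = crossing_point(betas, damage_ca, damage_da)
```

The same routine applies to the measurement G, since only the sign change of the difference between the two curves matters.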

FIG. 2: (Color online) The dependence of $\beta_b$ (characterized by the
point where $B_2(\beta) = B_1(\beta)$) on the network parameters. It is
found that $\beta_b$ increases with $N$, but decreases with $\alpha$,
$\gamma$, and $\langle k \rangle$. Each data point is averaged over 50
network realizations.

It is important to note that, in our design of numerical simulations, CA is
always implemented by removing the single node of the largest capacity.
That is the reason why the network damage caused by CA is constant in
Fig. 1. However, as $\beta$ increases, the cost for removing the
largest-capacity node increases monotonically, i.e.,
$E = a_N \sim C_N^{\beta}$. This raises the question of attack efficiency,
which is defined as the amount of network damage per unit of attack cost.
For measurement $G$, it is defined as $\rho_g = (N - G_m)/E$, with
$G_m = \min(G_1, G_2)$; for measurement $B$, it is defined as
$\rho_b = B_m/E$, with $B_m = \max(B_1, B_2)$. Interestingly, it is found
that, at the optimal defense, the attack efficiency is also minimized (the
insets of Fig. 1). We thus see that the optimal defense not only removes
the attacker's advantage in choosing the attack strategy, but also
minimizes the attack efficiency.

Physically, the meaning of the optimal defense can be understood as
follows. When $\beta$ is small, say $\beta \approx 0$, the network nodes
are equally protected regardless of their importance. To generate a large
damage, the attacker will certainly choose to attack the important nodes,
i.e. adopt CA. As $\beta$ increases, more defense resources are shifted to
the important nodes and, correspondingly, the defense of the non-important
nodes is weakened. However, as long as $\beta < \beta_{g,b}$, the damage
caused by CA is still larger than that of DA, so in this range CA is always
the choice for the attacker. Nevertheless, as $\beta$ increases, the damage
difference between CA and DA gradually narrows. Then, at the optimal
defense $\beta_{g,b}$, both attacks generate the same amount of network
damage. Since at this point the attacker cannot benefit from switching
between the attacks, the cost-based attacks are considered to have failed.
After that, as $\beta$ increases beyond $\beta_{g,b}$, the minority of
important nodes become overprotected, and the majority of non-important
nodes become less protected. Noticing this, the attacker will switch from
CA to DA, so as to achieve a larger damage. In the extreme situation of
$\beta \to \infty$, all the defense resources are allocated to the single
node of the largest capacity, while the other nodes of the network can be
easily attacked all together.

FIG. 3: (Color online) The security analysis for the western U.S. power
grid. (a) The dependence of $G_{1,2}$ on $\beta$; the optimal defense is
found at about $\beta_g \approx 1.45$. (b) The dependence of $B_{1,2}$ on
$\beta$; the optimal defense is found at about $\beta_b \approx 1.75$.
Inset: $\rho_b$ versus $\beta$, where $\rho_b$ is minimized at $\beta_b$.
Each data point is averaged over 10 attack realizations. For CA, the top 10
nodes of the highest load are attacked; while for DA, the nodes are
attacked in ascending order of their capacities.

FIG. 4: (Color online) The security analysis for the Internet at the
autonomous-system level. (a) The dependence of $G_{1,2}$ on $\beta$,
$\beta_g \approx 0.8$. (b) The dependence of $B_{1,2}$ on $\beta$,
$\beta_b \approx 1.1$. Please note the semi-logarithmic plot of $B_{1,2}$.
Inset: $\rho_b$ versus $\beta$, where $\rho_b$ is minimized at $\beta_b$.
Each data point is averaged over 10 attack realizations, as in Fig. 3.

As realistic networks have various structures, it is necessary to check the
dependence of the optimal defense on the network parameters. In particular,
we check the dependence of $\beta_b$ on the following network parameters:
the tolerance parameter $\alpha$, the average degree $\langle k \rangle$,
the degree exponent $\gamma$, and the system size $N$. (A similar
dependence is also valid for $\beta_g$.) The numerical results are plotted
in Fig. 2. The general finding is that the value of $\beta_b$ increases
with $N$, but decreases with $\alpha$, $\langle k \rangle$, and $\gamma$.
(For a random network, RN, we have $\gamma \to \infty$.) In other words, it
is the larger, sparser, and more heterogeneous networks that suffer more
from the cost-based attacks and, correspondingly, benefit more from the
optimal defense. Since infrastructure networks normally have a large size
and a heterogeneous structure, the study of optimal defense is of practical
concern.

How about the defense of realistic networks? To address this question, we
have analyzed the security of two typical infrastructure networks in our
society: (1) the electrical power grid of the western United States [14];
and (2) the Internet at the autonomous-system level [15]. The power-grid
network consists of $N = 4941$ nodes and has average degree
$\langle k \rangle \approx 2.67$; it has been widely used in the literature
as an example of a cascade network. The variation of $G_{1,2}$ as a
function of $\beta$ is plotted in Fig. 3(a), where the optimal defense is
found at about $\beta_g \approx 1.45$. In Fig. 3(b) we plot the dependence
of $B_{1,2}$ on $\beta$, where the optimal defense is found at about
$\beta_b \approx 1.75$. As in Fig. 1, we have also calculated the
dependence of the attack efficiency, $\rho_b$, on the defense parameter
$\beta$, where $\rho_b$ is found to be minimized at $\beta_b$. The Internet
we have employed consists of $N = 6474$ nodes and has average degree
$\langle k \rangle \approx 3.88$. The variations of $G_{1,2}$ and
$B_{1,2}$ as a function of $\beta$ are plotted in Fig. 4(a) and (b),
respectively. For measurement $G$, the optimal defense is found at about
$\beta_g \approx 0.8$; while for measurement $B$, the optimal defense is
found at about $\beta_b \approx 1.1$. Again, $\rho_b$ is minimized at
$\beta_b$. It is interesting to see that, compared with the standard SFN
model [Fig. 1] and the power-grid network [Fig. 3], the Internet is less
vulnerable to CA when $\beta < \beta_g$ in terms of measurement $G$
[Fig. 4(a)]. We attribute this behavior to the unique topology of the
Internet, e.g., the modular structure, the degree correlation, and the
hierarchical property. This is also consistent with our previous finding of
the dependence of the optimal defense on network parameters [Fig. 2].

Discussion and conclusion. - The main purpose of the present study is to
highlight the variability and flexibility of network attacks in real
situations, so as to bring caution to the defense of complex networks. Our
main finding is that, if the defense resources of a network are not well
distributed, the attacker can benefit from choosing between the attack
strategies. In showing this, we have employed a simple model of cascade
network and made a few assumptions on the network defense and attack,
which, when used to model real situations, should be carefully modified and
redefined. For instance, it has been shown recently that, as a balance
between network robustness and fragility, the relationship between node
capacity and load can be nonlinear [16]. This indicates that, to analyze
the security of such a network, the constant tolerance parameter used in
the current model should be modified. This kind of modification, however,
will not change the general picture of optimal defense. As long as the cost
factor of network attack is counted, an optimal defense will exist and be
an important issue in network security.

A point which should be specially addressed is that the current model
requires full knowledge of the network, including detailed information
about the network structure and flow dynamics. This information, while
available for some public systems such as the power grid [17] and the
Internet, is difficult to obtain for secret networks, for example terrorist
and Mafia networks. In a secret network, the important nodes, which possess
a larger degree and have higher ranks in the hierarchy, are usually well
covered and difficult to identify. This raises the problem of attacking
probability, a question investigated by Gallos et al. very recently [18].
In that study, the probability of removing a node is determined by three
factors: the node degree $k$, the intrinsic network vulnerability
$\alpha'$, and the node knowledge $\alpha''$. A key finding there is that,
as the information about the important nodes is gradually exposed
(increasing the value of $\alpha''$), the fraction of nodes needed to break
the network decreases quickly. An interesting point here is that, if we
regard the cover of the network information as an approach of network
defense, the study of Ref. [18] and the present work have essentially the
same basis. In particular, if we replace the parameter $\beta$ in Eq. (1)
by a new parameter $(\alpha' + \alpha'')/\eta$ ($\eta \approx 1.6$ is the
exponent that characterizes the relationship between the node capacity and
degree [19]), then the node defense defined in Eq. (1) is just the
reciprocal of the node vulnerability defined in Ref. [18]. For this reason,
we may say that the study in Ref. [18] is a special case of the cost-based
attacks proposed in the present work. Despite this point of similarity, the
two studies are actually dealing with very different problems. Simply
speaking, the study of Ref. [18] focuses on the scale of network damage, in
which the attack cost (information discovery) is variable and the attack
strategy is always fixed to CA; in contrast, the current study deals with
the situation of variable attack strategy and fixed attack cost, i.e., it
is a question about network optimization [20].

In summary, we have proposed the idea of cost-based attacks on complex
networks and investigated the problem of optimal network defense. Different
from previous studies, here we emphasize the initiative and flexibility of
the attacker in implementing the attacks, which is a step toward realistic
situations. We hope this study can stimulate new thinking on the security
of complex networks, and give indications for the design and defense of
infrastructure networks.